
[Filebeat] Kafka input, json payload #26833

Closed

Contributor

@mjmbischoff mjmbischoff commented Jul 11, 2021

What does this PR do?

It allows the Filebeat Kafka input to handle JSON. Specifically, this enables picking up structured data and exposing it under top-level fields instead of leaving escaped JSON in the message field.
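
To make the difference concrete, here is a minimal sketch of the idea in Go. It is not the code from this PR: `expandJSONPayload` is a hypothetical helper, and the sample payload and field names are invented for illustration. It only shows the contrast between keeping the payload as a string in `message` and promoting its keys to top-level fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// expandJSONPayload is a hypothetical helper (not Filebeat's actual code).
// It decodes a Kafka message value as a JSON object and returns its keys as
// top-level event fields. If the payload is not a JSON object, it falls back
// to keeping the raw text under "message".
func expandJSONPayload(payload []byte) map[string]interface{} {
	var fields map[string]interface{}
	if err := json.Unmarshal(payload, &fields); err != nil || fields == nil {
		return map[string]interface{}{"message": string(payload)}
	}
	return fields
}

func main() {
	// Invented sample payload, as it might arrive from a Kafka topic.
	payload := []byte(`{"level":"error","service":"checkout","status":500}`)

	// Without JSON handling: the whole payload ends up as one string.
	raw := map[string]interface{}{"message": string(payload)}
	fmt.Println(raw["message"])

	// With JSON handling: each key becomes its own field.
	decoded := expandJSONPayload(payload)
	fmt.Println(decoded["level"], decoded["service"], decoded["status"])
}
```

Falling back to `message` on a decode failure is a design choice made for this sketch; a real input would more likely make that behaviour configurable.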

Why is it important?

Kafka is often used to pull data away from the log sources as fast as possible, to avoid disks filling up and to allow the backend of the pipeline to be serviced, or incidents to be handled, without dropping events on the floor.

This avoids the need to apply the decode_json_fields processor immediately after the input before any of the fields in the structured data can be processed.

In the context of modules this change can be a big advantage: right now we can override the input, but we cannot easily change the processors used by the module or inject a processor between the input and the module. While this doesn't solve the issue of mismatched structure, it at least allows one to transform the data before it's stored in Kafka so that modules can be used post-Kafka.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jul 11, 2021
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jul 11, 2021
@elasticmachine
Collaborator

💚 Build Succeeded


Build stats

  • Start Time: 2021-07-11T01:33:02.013+0000

  • Duration: 101 min 31 sec

  • Commit: ad02f7b

Test stats 🧪

Test Results
Failed 0
Passed 14806
Skipped 2312
Total 17118


💚 Flaky test report

Tests succeeded.


@mjmbischoff mjmbischoff changed the title filebeat: Kafka input json payload [Filebeat] Kafka input, json payload Jul 11, 2021
@mjmbischoff mjmbischoff added the Filebeat Filebeat label Jul 11, 2021
@kvch
Contributor

kvch commented Jul 14, 2021

@mjmbischoff We have a Filebeat-wide initiative to expose the same parsers in all inputs as we have in log/filestream (multiline, json and container), tracked here: #26130

Do you mind looking into it and implementing the support for the input? We want to have a uniform parsing experience for all Filebeat inputs, so I am afraid this PR cannot go in as is.

@mjmbischoff
Contributor Author

Still on my radar; I'm looking into implementing this based on parsers. I did hit some snags, as there doesn't seem to be an easy way to avoid the string -> JSON -> string -> parser (ndjson) dance. Also, the parsers seem to be pull-based, while the Kafka input is set up as push-based code-wise. It's going to take some changes.
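
The push/pull mismatch can be illustrated with a small Go sketch. Everything here is an assumption for illustration: `Message`, `chanReader`, `Push` and `Next` are hypothetical names, not the Filebeat reader or parser API. The sketch shows one common way to bridge the two styles, a buffered channel that the push-based consumer writes to and a pull-style `Next()` reads from.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a hypothetical stand-in for the unit a parser would consume.
type Message struct {
	Content []byte
}

// errClosed is returned by Next once the source is closed and drained.
var errClosed = errors.New("source closed")

// chanReader adapts a push-based producer to a pull-based reader.
// All names here are illustrative assumptions, not Filebeat types.
type chanReader struct {
	ch chan []byte
}

func newChanReader(buffer int) *chanReader {
	return &chanReader{ch: make(chan []byte, buffer)}
}

// Push is what a push-based consumer callback would call per message.
func (r *chanReader) Push(payload []byte) { r.ch <- payload }

// Close signals that no more messages will be pushed.
func (r *chanReader) Close() { close(r.ch) }

// Next blocks until a message is available, which is the pull-style
// call a downstream parser would make.
func (r *chanReader) Next() (Message, error) {
	payload, ok := <-r.ch
	if !ok {
		return Message{}, errClosed
	}
	return Message{Content: payload}, nil
}

func main() {
	r := newChanReader(2)

	// Simulate the push side handing over two messages, then closing.
	go func() {
		r.Push([]byte(`{"a":1}`))
		r.Push([]byte(`{"b":2}`))
		r.Close()
	}()

	// The pull side drains messages until the source is closed.
	for {
		msg, err := r.Next()
		if err != nil {
			break
		}
		fmt.Println(string(msg.Content))
	}
}
```

A channel keeps the adapter short, but it adds a hand-off per message and leaves acknowledgement and back-pressure handling open, which is the kind of change hinted at above.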

@botelastic

botelastic bot commented Aug 26, 2021

Hi!
We just realized that we haven't looked into this PR in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

@mjmbischoff
Contributor Author

#27335 supersedes this PR; I expect it will be the one to get merged. Removing stale.

@mergify
Contributor

mergify bot commented Sep 6, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b kafka-input-json-payload upstream/kafka-input-json-payload
git merge upstream/master
git push upstream kafka-input-json-payload

@mjmbischoff
Contributor Author

Closing this, as #27335 got merged.

@mjmbischoff mjmbischoff closed this Sep 6, 2021